The work reported was carried out May 1976 through April 1977 under
Task 1W7627I0AD2702, Remote Sensing Alarm Techniques. The latter task constitutes
authorization for the work.

Reproduction  of  this  document  in  whole  or  in  part  is  prohibited  except  with 
permission  of  the  Director,  Chemical  Systems  Laboratory,  Attn:  DRDAR-CLJ-R,  Aberdeen 
Proving  Ground,  Maryland  21010;  however,  DDC  and  the  National  Technical  Information  Service 
are  authorized  to  reproduce  the  document  for  United  States  Government  purposes. 

SPECTRAL  CLASSIFICATION  TECHNIQUES  FOR  REMOTE  SENSING  ALARMS


I.  INTRODUCTION.


The remote sensing of toxic chemical agent clouds requires discrimination techniques
that  minimize  responses  due  to  background  changes  and  interferences  while  maintaining  adequate 
agent  response.  Because  of  the  large  number  of  possible  variations  in  agents,  backgrounds  and 
interferences,  most  techniques  use  computer  optimization  programs  to  find  a discrimination 
function.  Such  programs  are  usually  designed  to  maximize  the  response  of  some  target  spectrum 
(the  agent),  while  limiting  the  responses  of  all  the  constraint  spectra  (backgrounds  and 
interferences)  to  the  noise  level  of  the  instrument.  Since  computer  capacity  restricts  the  number 
of  constraints  that  can  be  used  at  any  one  time,  there  exists  a need  for  a classification  system  to 
select  a representative  set  of  independent  background  and  interference  spectra  that  serve  as 
adequate  constraints  in  all  situations.  In  addition,  the  system  should  be  capable  of  examining 
newly  obtained  spectra  and  rejecting  those  found  to  be  similar  to  existing  constraints.  The 
purpose  of  this  report  is  to  describe  the  classification  methods  tried  to  date  and  to  construct  a 
file  of  independent  spectra  using  the  most  appropriate  technique. 

The data used to examine the classification methods were difference energy spectra
from the exploratory development (XD) Passive LOPAIR. The instrument is composed of two
subsystems: a spectroradiometer and a discriminator.

The  spectroradiometer  uses  a circular  variable  filter  (CVF)  and  a nitrogen  cooled 
Hg:Cd:Te  detector.  The  CVF  covers  half  of  a wheel  that  rotates  at  1 Hz.  A chopper,  which  is 
mechanically  connected  to  the  rotating  filter  wheel,  rotates  at  1000  Hz  with  the  detector 
alternately  viewing  the  scene  and  an  internal  blackbody.  This  chopping  produces  an  ac  signal 
whose  peak-to-peak  amplitude  is  proportional  to  the  energy  difference  between  the  background 
and  the  internal  blackbody.  The  ac  signal  is  normalized  by  automatic  gain  control  and 
demodulated to produce a continuous difference energy spectrum of the 8.4 to 12.5 µm region.
The  AGC  maintains  a constant  amplitude  of  ±3.0  volts  depending  on  whether  the  scene  is  hotter 
or  cooler  than  the  internal  blackbody.  The  field  stop  of  the  CVF  is  set  so  that  the 
spectroradiometer  has  about  1%  spectral  bandwidth,  but  the  actual  spectral  resolution  is 
about  2%. 

The discriminator is a small, special-purpose, hybrid computer, which computes the
absolute value of

$$\left| \int_{\lambda_1}^{\lambda_2} c(\lambda)\, x(\lambda)\, d\lambda \right|$$

where

$c(\lambda)$ is the coefficient value
$x(\lambda)$ is the signal value
$\lambda_1$ is the beginning of the first channel
$\lambda_2$ is the end of the last channel.

The discriminator can accept up to sixteen channels, where a channel is defined as any part of
the difference energy spectrum with bandwidth ranging from zero to the full spectral range of
the instrument. To account for minor variations in instrument responsivities, this range was
limited to 8.68 to 11.83 µm.

The  signal  from  each  channel  is  weighted  by  an  op-amp  whose  effective  gain  is 
proportional  to  the  channel  coefficient.  The  weighted  outputs  of  all  channels  are  summed  by  an 
integrating  network  with  a time  constant  of  0.1  sec.  The  integrator  is  discharged  with  a time 
constant  of  2.7  sec,  thus  averaging  the  present  value  with  previous  values  to  improve  the 
signal-to-noise  ratio  of  the  discriminator  output. 
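
As a rough numerical illustration of the channel weighting and summation described above, the following Python sketch approximates the discriminator response for a sampled spectrum. The function name, the (start, end, coefficient) channel tuples, and the midpoint integration rule are assumptions made for illustration only; the instrument performs this operation in analog hardware, and the 0.1-sec and 2.7-sec time constants are not modeled.

```python
def discriminator_response(wavelengths, signal, channels):
    """Approximate |integral of c(lambda) * x(lambda) d(lambda)|.

    wavelengths: increasing sample positions in micrometers
    signal:      difference energy signal at each sample
    channels:    list of (start_um, end_um, coefficient) tuples (hypothetical layout);
                 the coefficient is treated as constant within a channel and
                 zero outside every channel.
    """
    total = 0.0
    for i in range(len(wavelengths) - 1):
        lo, hi = wavelengths[i], wavelengths[i + 1]
        mid = 0.5 * (lo + hi)                      # midpoint of this interval
        x_mid = 0.5 * (signal[i] + signal[i + 1])  # average signal over the interval
        for start, end, coeff in channels:
            if start <= mid < end:
                total += coeff * x_mid * (hi - lo)
                break
    return abs(total)
```

With a flat unit signal, a single channel from 9.0 to 10.0 µm with coefficient 2 gives a response close to 2, while two equal-width channels with coefficients +1 and -1 cancel.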

This output is the instrument's response, which can be either a positive or negative
voltage. Typically, the discriminant functions were designed to produce a positive response for a
target.


The LOPAIR spectra were standardized by dividing by the instrument's responsivity
curve, and normalized by dividing by the maximum absolute value. Both field spectra and
simulated field spectra from laboratory measurements were included in the data file. The spectra
were initially categorized according to their physical type: that is, low-angle sky, terrain,
dust, etc.
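
The two preprocessing steps described above can be sketched in Python as follows. This is a minimal illustration, assuming the raw spectrum and the responsivity curve are sampled at the same wavelengths; the function name and the list-based representation are hypothetical.

```python
def standardize_and_normalize(raw, responsivity):
    """Divide by the responsivity curve, then scale so the largest |value| is 1."""
    if len(raw) != len(responsivity):
        raise ValueError("spectrum and responsivity must have the same length")
    # Standardize: remove the instrument's wavelength-dependent responsivity.
    standardized = [s / r for s, r in zip(raw, responsivity)]
    # Normalize: divide by the maximum absolute value.
    peak = max(abs(v) for v in standardized)
    if peak == 0:
        raise ValueError("spectrum is identically zero")
    return [v / peak for v in standardized]
```

The sign of each point is preserved, so a scene cooler than the internal blackbody still yields a negative spectrum after normalization.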

II.  SPECTRAL  DATA  FILE. 


The  data  file  consists  of  agents,  simulants,  interferences  and  backgrounds.  The 
classification  techniques  are  applied  only  to  the  spectra  being  used  as  constraints:  that  is, 
interferences  and  backgrounds.  The  set  of  backgrounds  includes  low-angle  sky  (LAS),  terrain,  and 
combinations  of  both.  Interferences  include  dusts,  smokes,  decontaminants  and  explosions. 

A.  Backgrounds. 

The  background  data  set  consists  of  86  LAS  and  9 terrain  (or  combination)  spectra. 

The significant features in any LAS spectrum are the intensity of the ozone peak
and the slope of the spectrum. For the purpose of this study, the ozone intensity is defined as
the magnitude of the spectrum at 9.65 µm, the approximate center of the doublet. Ozone
intensity and slope vary with changes in ambient temperature, cloud cover, ozone concentration,
atmospheric transmittance, and angle of observation. Given a fixed angle of instrument elevation,
cloud cover variations are responsible for most spectral changes in a continuous 12-hour period.
Changes resulting from variations in ambient temperature and atmospheric transmittance are
usually gradual and of a lower magnitude. Most variations in ozone concentration occur over a
long time period and have a small effect in relation to the other parameters. At small angles
(0° to 10°), changes in instrument elevation can produce significant variations in the spectrum.
These variations are presumed to be equivalent to combinations of changes in the other
parameters.
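
The two LAS features named above could be extracted from a sampled spectrum roughly as follows. This is only a sketch: the report defines ozone intensity as the value at 9.65 µm but does not state how slope was measured, so the least-squares slope used here is an assumption, as are the function name and the choice to return the signed value at the ozone wavelength.

```python
def las_features(wavelengths, spectrum, ozone_um=9.65):
    """Return (ozone_value, slope) for one sampled LAS spectrum."""
    # Ozone intensity: signal at the sample nearest 9.65 micrometers.
    idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - ozone_um))
    ozone_value = spectrum[idx]

    # Slope: ordinary least-squares fit of signal against wavelength
    # (assumed definition; not taken from the report).
    n = len(wavelengths)
    mean_w = sum(wavelengths) / n
    mean_s = sum(spectrum) / n
    sxx = sum((w - mean_w) ** 2 for w in wavelengths)
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(wavelengths, spectrum))
    return ozone_value, sxy / sxx
```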


Figure 1 contains 11 LAS spectra showing variations primarily in ozone intensity.
These spectra came from a 12-hour background test that began in partly cloudy skies (last
spectrum) and ended in a snowstorm (first four spectra). The instrument had a fixed angle
throughout the test, and the ambient temperature varied only 5 °C over the entire run. The
changes are due primarily to cloud cover variations, with decreasing atmospheric
transmittance occurring as the snow increased.

Figure  2 shows  five  spectra  having  nearly  the  same  ozone  intensity,  but  different 
slopes.  The  spectra  were  obtained  on  clear  days  in  the  fall,  winter,  and  spring,  and  reflect 
changes  in  all  parameters  except  cloud  cover. 

Figure 3 shows five spectra in which both slope and ozone intensity vary. These
spectra were taken over a 5-minute time period with changes due entirely to varying cloud cover.

Occasionally,  situations  occur  where  the  LAS  spectrum  contains  a peculiar  feature  of 
uncertain  origin.  Figure  4 shows  one  example  from  a field  test  at  Yuma,  Arizona.  Note  that  the 
ozone  peak  crosses  the  zero  point.  This  implies  that  the  upper  layers  of  the  atmosphere  which 
contain  ozone  are  at  a higher  temperature  than  the  surface  layer,  an  unusual  occurrence  for 
those  altitudes.  The  exact  cause  is  unknown. 

Figure  5 shows  three  spectra  obtained  from  a background  run  at  Aberdeen  Proving 
Ground.  The  first  spectrum  was  typical  of  the  run.  The  remaining  two  spectra  were  taken  from  a 
variation  that  lasted  about  $ minutes  and  caused  a considerable  negative  response  (figure  6).  The 
shape and duration of the response plot indicate that the probable cause was a solar transit near
the field of view of the instrument.

Abnormal  occurrences  such  as  those  usually  cause  a response,  but  are  difficult  to 
classify because their features are unique. They are generally added to the constraint file, but
they  probably  would  not  prevent  responses  to  spectra  that  are  similar,  but  not  identical.  Spectra 
such  as  these  are  not  classified  because  of  their  uncertain  origins. 

The terrain data file consists of just nine spectra. The spectra of trees, water and soil
appear similar to blackbodies, but contain reflected ozone components (figure 7). Mountains and
other features with varying surface compositions can have unusual and unpredictable spectra,
particularly under direct sunlighting. Figure 8 shows three spectra of similar-looking areas on a
snowcapped mountain range under nearly the same lighting conditions. The reflected ozone
component is present, but the spectra have different slopes and two of them cross zero. The
differences are apparently due to sun orientation relative to the mountain, and surface properties
of the mountain. Some portions of the mountain were snow covered while others were bare. No
surface had any vegetative cover. These terrain spectra probably represent only a small part of a
necessary design file, but are still included in the constraint file. The spectra crossing zero are
treated like the abnormal LAS spectra. They are used as constraints, but not classified.

Figure 5.  Two Spectra of Unknown Origin and Their LAS Background

Figure 6.  Negative Response from Spectra of Unknown Origin

B.  Interferences. 


The  interference  data  file  consists  of  55  dust  spectra,  plus  several  assorted  smokes, 
explosions and decontaminants. Computer simulations and field testing have shown that
simulated field spectra derived from laboratory data can serve as adequate constraints.3 Therefore,
many of the dust, smoke and decontaminant spectra are simulated. Since the field spectra of
interferences  have  either  a LAS  or  terrain  background,  they  contain  background  variations  in 
addition  to  changes  in  concentration  of  the  interference  itself. 

The  spectra  of  explosions  present  a unique  problem  in  constraining  and 
classification.  Dusts,  smokes  and  decontaminants  have  specific  absorption  bands  and  predictable 
spectral  structure.  According  to  our  data  file,  the  primary  effect  of  an  explosion  is  a sudden 
heating  of  air  in  the  instrument’s  field  of  view  (FOV),  usually  resulting  in  a totally  saturated 
positive  spectrum  with  no  structure.  Intermediate  spectra  before  and  after  saturation  are 
distorted,  with  apparently  random  structure  (figure  9).  Therefore,  present  data  indicate  that 
explosions  produce  unpredictable  spectral  features  that  cannot  be  classified  into  finite  member 
groups. 


As mentioned before, the dust data consist of both simulated and real field spectra.
The primary difference between them is the background. The simulations have a plain blackbody
background and the real spectra have varying LAS or terrain backgrounds. Previous studies had
shown that there were five compounds with significant absorption in the 8- to 13-µm region:
kaolinite, illite, montmorillonite, silica, and calcium carbonate.4 Figure 10 (a, b, c) shows the
three  types  of  simulated  spectra  used  as  constraints.  The  first  contains  kaolinite,  and  the  other 
two  contain  combinations  of  illite,  montmorillonite  and  silica.  Calcium  carbonate,  originally 
omitted  because  soil  samples  indicated  that  it  rarely  occurred,  was  observed  in  later  field  tests 
and  is  constrained  by  field  spectra  such  as  in  figure  10(d). 

The other interferences in the data file are HC smoke, RP smoke, and methyl
cellosolve, the only component of DS2 decontaminant with significant absorption bands in the
8- to 13-µm region. These interferences are also constrained by simulated spectra and have
response effects similar to dusts. Figure 11 shows some simulated spectra of these interferences.

III.  CLASSIFICATION  METHODS. 

Four classification methods were applied to the spectral data: inspection,
correlation, linear programming and factor analysis. Inspection was simply a visual comparison of
spectral features. Correlation involved the calculation of a single numerical value that indicated
how closely two spectra compared to each other. Linear programming consisted of a modification of
the optimization program to examine the responses of backgrounds and interferences. Finally,
factor analysis involved a mathematical attempt to transform the spectra into a small number of
significant variables.

Figure 11.  Three Interference Spectra (panel c: RP SMOKE 6000CL)

To  simplify  the  initial  classification  attempts,  only  LAS  spectra  were  used.  Terrains 
and  interferences  were  examined  if  the  specific  method  seemed  to  be  successful  with  LAS.  LAS 
was  chosen  because  the  data  set  was  larger  and  more  varied  than  the  others.  In  addition,  most  of 
the  other  field  spectra  had  LAS  backgrounds  or  reflected  LAS  components,  making  their 
classification  partially  dependent  on  the  LAS  background. 

A.  Inspection. 


The  process  of  visual  inspection  began  as  soon  as  field  spectra  were  obtained.  As  the 
number  of  spectra  in  the  data  file  increased,  the  time  and  expense  for  using  the  computer 
optimization  program  increased.  Since  the  program  had  a reasonable  limit  of  65  constraints,  it 
soon  became  necessary  to  select  65  spectra  from  a much  larger  data  file. 

The selection procedure was straightforward. Spectra having approximately equal
slopes and ozone intensities were compared to each other and placed in groups normally
consisting of 2 to 10 spectra. One spectrum from each group, plus the individual spectra that
could not be grouped, would be added to the constraint set. For example, figure 12 shows a
group of seven spectra having similar characteristics, with the first two being nearly identical.
One spectrum would be selected to represent the entire group. A set of channels and coefficients
would be generated with these constraints, and the simulation program would be used to check
the responses of spectra not in the constraint file.


Figure 12.  Similar Spectra (horizontal axis: wavelength in µm)

In  the  majority  of  cases,  the  technique  provided  adequate  constraints  for  the 
existing spectral data, but there were problems. A typical solution for any set of channels and
coefficients limited the constraint response to the ±0.3-volt range. For any set of constraints,
only  those  spectra  whose  responses  reached  the  maximum  limits  constrained  the  solution:  usually 
10  to  20  out  of  the  65.  Given  the  same  initial  set,  different  targets  had  different  contributing 
spectra.  Examination  of  16  different  solutions  involving  five  different  targets  indicated  that, 
while  some  spectra  affected  almost  all  the  solutions  (figure  13  for  example),  there  were  others 
that  never  reached  ±0.3  volts.  With  few  exceptions,  there  were  no  visual  clues  to  explain  why  a 
spectrum  did  or  did  not  contribute  to  a solution.  The  inability  to  select  the  dominant  constraint 
spectra  visually  for  any  target  placed  a severe  limitation  on  the  utility  of  visual  inspection.  For 
newly  obtained  spectra,  reliable  classifications  required  a new  spectrum  to  be  nearly  identical  to 
one  already  included  in  the  constraint  file. 


There  were  other  drawbacks  as  well.  First,  the  technique  was  basically  subjective, 
and  results  could  vary  from  person  to  person.  In  addition,  the  technique  required  all  new  spectra 
to  be  recorded  and  precisely  plotted  prior  to  classification.  One  of  the  goals  of  classification  was 
to  eliminate  the  need  for  recording  redundant  spectra.  The  obvious  solution  was  to  seek  a 
mathematical  technique  that  would  be  accurate,  consistent,  and  fully  automatic. 

B.  Correlation  Analysis.

The  inspection  technique  grouped  spectra  according  to  their  spectral  characteristics. 
This  method  was  limited  by  the  fact  that  there  was  no  mathematical  basis  for  the  technique. 
Basic  correlation  theory  provides  a mathematical  technique  of  comparing  two  functions 
dependent  on  the  same  parameter.5  Given  two  spectra,  the  calculation  of  the  correlation 
coefficient  produces  a single  numerical  value  for  the  degree  of  similarity  between  them. 

1.  Theory. 

A spectrum can be represented by an N-dimensional vector where N is the number
of spectral intervals. Given two spectra with signal values $x_j$ and $y_j$, the vector representations are

$$\mathbf{X} = \sum_{j=1}^{N} \mathbf{e}_j x_j \quad \text{and} \quad \mathbf{Y} = \sum_{j=1}^{N} \mathbf{e}_j y_j$$

where $\mathbf{e}_j$ is the unit vector.

From vector algebra, the dot product is given by

$$\mathbf{X} \cdot \mathbf{Y} = \sum_{j=1}^{N} x_j y_j \tag{1}$$

The lengths of the vectors are

$$|\mathbf{X}| = X = \left( \sum_{j=1}^{N} x_j^2 \right)^{1/2} \quad \text{and} \quad |\mathbf{Y}| = Y = \left( \sum_{j=1}^{N} y_j^2 \right)^{1/2} \tag{2}$$

The ratio of the dot product to the product of the lengths is the direction cosine of
the angle between them. In this case

$$\cos \alpha = \frac{\mathbf{X} \cdot \mathbf{Y}}{X\,Y} \tag{3}$$

By definition, $\cos \alpha$ is the correlation coefficient.

Since the absolute value of the dot product of two vectors is less than or equal to the
product of the lengths, it is clear that the absolute value of the correlation coefficient is less
than or equal to 1.

Letting K represent the correlation coefficient and applying equations (1-3) gives the
following easily calculated expression5

$$K = \frac{\sum_{j=1}^{N} x_j y_j}{\left( \sum_{j=1}^{N} x_j^2 \, \sum_{j=1}^{N} y_j^2 \right)^{1/2}} \tag{4}$$

In applying correlation to the spectral data file, the values $x_j$ and $y_j$ are the signals
of two spectra at 0.01-µm intervals from 8.68 to 11.83 µm. The correlation coefficient can assume
any value from 1 to -1. Identical spectra will have correlations of 1, extremely noisy spectra
could have correlations approaching 0, and two spectra having hot and cold backgrounds of equal
difference energy would have a correlation of -1.
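
The correlation coefficient defined in this section (the cosine of the angle between two spectrum vectors, with no mean subtraction) can be computed directly from the sampled signal values. The Python sketch below is illustrative only; the function name is hypothetical and it is not the program used in the study.

```python
import math


def correlation(x, y):
    """Direction cosine between two spectra sampled at the same wavelengths."""
    if len(x) != len(y):
        raise ValueError("spectra must have the same number of points")
    dot = sum(a * b for a, b in zip(x, y))        # dot product of the two vectors
    len_x = math.sqrt(sum(a * a for a in x))      # length of the first vector
    len_y = math.sqrt(sum(b * b for b in y))      # length of the second vector
    if len_x == 0 or len_y == 0:
        raise ValueError("correlation is undefined for an all-zero spectrum")
    return dot / (len_x * len_y)
```

A spectrum correlated with itself gives a value of 1 (to rounding), and a spectrum correlated with its own negative, as for hot and cold backgrounds of equal difference energy, gives -1.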

2.  Results. 

A computer  program  was  written  to  calculate  the  correlation  coefficients  for  an 
array  of  up  to  72  spectra.  The  program  operator  had  the  option  of  selecting  either  a numerical 
table  or  a plot  of  the  correlation  of  any  spectrum  with  the  other  71  in  the  array.  In  addition, 
other  spectra  not  in  the  original  array  could  be  correlated  without  changing  the  original  group. 

When correlation analysis began, there were 59 LAS spectra in the data file. All were
placed into an array, and correlation coefficients calculated showing the relationship of each
spectrum with every other one in the array. The coefficients for these spectra ranged
from 1.0000 (figure 14) to 0.8991 (figure 15).

The  next  step  was  to  establish  the  relationship  between  the  response  difference  and 
the  correlation  coefficient  for  two  spectra.  Since  the  normal  constraint  response  limit  was 
±0.3  volts,  response  values  for  existing  sets  of  channels  and  coefficients  were  examined  to  find 
the  lowest  correlation  value  that  would  still  correspond  to  a response  difference  within 
±0.3  volts.  This  value  was  empirically  determined  to  be  0.9998.  Thus,  if  two  spectra  had  a 
correlation  value  of  0.9998  or  greater,  they  would  be  considered  the  same  constraint. 

Having  established  the  criterion  for  correlation,  some  questions  had  to  be  answered 
concerning  its  interpretation  and  effectiveness.  The  first  question  was  whether  or  not  the 
correlation  values  alone  could  be  used  to  select  a set  of  LAS  spectra  that  would  effectively 
constrain  the  entire  group.  A listing  was  made  of  the  59  LAS  spectra.  The  correlation  values  for 
each  spectrum  were  examined  and  all  other  spectra  that  correlated  to  within  0.9998  or  better 
were  listed  next  to  that  spectrum.  When  this  was  completed,  the  list  was  examined  to  find  the 
least  number  of  spectra  that  would  constrain  the  entire  group.  A set  of  29  were  chosen,  and 
using  DMMP  as  a target,  channels  and  coefficients  were  computed  for  this  constraint  set.  The 
solution  was  then  applied  to  the  simulation  program  to  obtain  response  values  for  the  other 
30  spectra.  In  every  case,  the  responses  of  the  unconstrained  spectra  were  within  ±0.3  volts  of 
their  respective  constraints.  This  result  was  expected  because  of  the  similarity  between  groups 
selected  visually  and  groups  selected  by  correlation.  The  advantage  of  the  correlation  method  was 
in  having  a mathematical  basis  for  the  selection  process. 
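
The selection step described above (list every spectrum that correlates at 0.9998 or better with each spectrum, then pick a small set that covers the whole group) can be sketched in Python as follows. The report does not say how the list was searched, so the greedy covering rule used here is an assumption; a greedy pass is not guaranteed to find the smallest possible set. Function names are hypothetical.

```python
import math


def correlation(x, y):
    """Direction cosine between two spectra (no mean subtraction)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def select_representatives(spectra, threshold=0.9998):
    """Greedily choose indices of spectra so that every spectrum correlates
    at `threshold` or better with at least one chosen spectrum."""
    n = len(spectra)
    # covers[i] = indices of all spectra considered "the same constraint" as spectrum i
    covers = [
        {j for j in range(n) if correlation(spectra[i], spectra[j]) >= threshold}
        for i in range(n)
    ]
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # Pick the spectrum that covers the most still-uncovered spectra.
        best = max(range(n), key=lambda i: len(covers[i] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen
```

Each non-zero spectrum always covers itself, so the loop terminates with every spectrum assigned to some representative.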

At  this  point,  it  was  obvious  that  given  the  proper  hardware,  the  correlation 
technique  could  be  used  to  monitor  spectra  automatically  during  field  tests  and  eliminate  those 
that  were  similar  to  existing  constraints.  However,  many  of  the  problems  that  occurred  in  visual 
analysis  were  occurring  in  correlation  analysis.  As  mentioned  before,  for  any  discriminant 
function  there  existed  a few  dominant  spectra:  7 out  of  29  for  this  constraint  set.  According  to 
their correlation values these seven spectra were expected to constrain only 17 spectra, not all 59
in  the  data  file.  From  a visual  standpoint  there  was  nothing  unusual  about  these  seven  spectra 
and  correlation  analysis  provided  no  additional  information.  Another  problem  occurred  when 
correlation  values  for  series  of  spectra,  such  as  in  figure  1,  were  compared  to  their  respective 
responses.  Using  the  highest  ozone  spectrum  as  a base,  correlation  values  decreased  continuously 
in going from the highest to the lowest ozone intensity. With one exception, response values for
various  sets  of  channels  and  coefficients  were  neither  continuously  decreasing  nor  continuously 
increasing.  Thus,  given  a spectrum  in  the  middle  of  a series,  its  response  could  not  be  estimated 
from  its  correlation  value  and  the  responses  of  the  other  spectra. 

The  final  step  in  the  correlation  approach  was  to  apply  the  technique  to  the  set  of 
interference  spectra.  An  obvious  shortcoming  occurred  when  the  technique  was  used  with 
simulated  spectra.  Field  spectra  that  had  been  constrained  by  simulated  spectra  did  not  correlate 
with  them.  As  a result,  the  technique  could  not  be  used  with  simulated  spectra,  and  thus  had 
very  limited  application  for  interference  testing. 

Figure 14.  LAS Spectra with Correlation Value of 1.0000

The overall conclusion of the correlation technique is that it provides a limited
capability to automatically classify field spectra. The method is limited to comparisons of nearly
identical spectra. Given a file of different spectra, it cannot select the ones that control the
solutions of the optimization program.

C.  Linear-Programming Method.

The  first  two  classification  techniques  had  significant  limitations.  They  could  not 
select  the  dominant  spectra  within  a data  file  or  constraint  set.  Also,  they  could  not  be  used  to 
estimate  response  values  of  spectra  that  did  not  look  like  or  correlate  with  existing  constraints. 
Since  the  success  or  failure  of  any  technique  ultimately  depended  on  the  response  of  a spectrum 
to  a given  channel  and  coefficient  solution,  a logical  choice  for  examination  was  the  optimization 
program  that  calculated  these  solutions. 

The  first  task  was  to  determine  how  the  program  could  be  used  as  a classification 
method.  In  its  normal  form,  the  program  solves  a multivariable  linear  equation  for  a given  set  of 
constraint spectra and a given target. Using the notation of Flanigan,1 the program finds a set of
vectors W which satisfy

$$|\mathbf{W} \cdot \mathbf{f}_j| \le \epsilon$$

where

the $\mathbf{f}_j$ are linearly independent constraint spectra and
$\epsilon$ is the maximum allowable constraint response.

The program continues and finds a solution W from the set of W vectors that maximizes

$$R = \mathbf{W} \cdot \mathbf{f}_t$$

where

$\mathbf{f}_t$ is the target spectrum and
R is the response to the target.

If  the  target  spectrum  was  replaced  by  a newly  obtained  background  or  interference  spectrum 
the  program  would  compute  the  maximum  absolute  response  value  possible  for  that  spectrum 
and  constraint  set.  If  this  response  was  within  some  specified  limit,  the  spectrum  could  be 
eliminated  from  the  group  of  potential  constraints. 

The  optimization  program  was  modified  to  remove  the  portions  not  required  for  the 
classification  study.  Since  the  program  required  a set  of  constraints  to  produce  a target  response, 
an initial set had to be chosen using some other technique. A set of 16 LAS spectra was
selected based on physical characteristics. This initial selection required subjective judgements
that would be acceptable provided the rest of the program produced decisive results.

The most difficult step was determining the response criterion for the elimination of
spectra. Spectra not included in the constraint set would probably have maximum response
values that were outside the normal range of ±0.3 volts. The new limit had to be high enough to
eliminate common spectra but low enough to prevent the elimination of significant spectra. A
value of three times the normal constraint limit was chosen as the maximum-allowed-response
range. This meant that a spectrum eliminated by this method could have as much as a ±0.9-volt
response for a solution designed to detect an agent. However, the actual response of a spectrum
to a solution calculated for an agent was expected to be much less than the
maximum-allowed-response value.
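
Stated compactly, the elimination test described above amounts to the following. The symbols $\mathbf{s}$ (the candidate spectrum inserted as the target), $m$ (the number of constraint spectra), and $R_{\max}$ are introduced here for illustration and are not part of the original notation. Any additional restrictions the optimization program places on W (for example on the number of channels or the range of coefficients) are not described in this section and are omitted.

```latex
\begin{aligned}
R_{\max}(\mathbf{s}) &= \max_{\mathbf{W}} \; |\mathbf{W} \cdot \mathbf{s}|
  \quad \text{subject to} \quad |\mathbf{W} \cdot \mathbf{f}_j| \le \epsilon,
  \qquad j = 1, \dots, m, \\
\text{eliminate } \mathbf{s} \quad &\text{if} \quad
  R_{\max}(\mathbf{s}) \le 3\epsilon = 3 \times 0.3\ \text{V} = 0.9\ \text{V}.
\end{aligned}
```

Without some further bound on W, the maximum is finite only when $\mathbf{s}$ lies in the span of the constraint spectra, which is one reason the practical limits of the program matter to the result.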

The  classification  procedure  was  simple.  Starting  with  the  basic  constraint  set,  the 
remaining  70  LAS  spectra  (27  more  had  been  added  since  the  correlation  study)  were  inserted 
individually  as  target  spectra  in  the  modified  program.  If  the  response  of  the  spectrum  exceeded 
±0.9  volts,  it  was  added  to  the  constraint  set  and  the  process  continued  with  the 
enlarged  constraint  set  and  remaining  spectra. 
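
The procedure can be outlined in Python as below. This is a sketch of the bookkeeping only: `max_response` stands in for the modified optimization program and must be supplied by the caller, and the names `classify`, `max_response`, and `limit` are hypothetical.

```python
def classify(initial_constraints, candidates, max_response, limit=0.9):
    """Insert each candidate as a target; keep it as a constraint if its
    maximum absolute response exceeds `limit` volts, otherwise eliminate it.

    max_response(spectrum, constraints) -> float is a caller-supplied stand-in
    for the linear-programming step (not implemented here).
    """
    constraints = list(initial_constraints)
    eliminated = []
    for spectrum in candidates:
        response = max_response(spectrum, constraints)
        if abs(response) > limit:
            constraints.append(spectrum)   # later candidates see the enlarged set
        else:
            eliminated.append(spectrum)
    return constraints, eliminated
```

Because the constraint set grows as the loop runs, the outcome can depend on the order in which candidates are presented.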

The  response  values  obtained  from  this  procedure  were  larger  than  anticipated.  It 
was  hoped  that  most  of  the  spectra  would  have  responses  close  to  the  normal  range  of 
±0.3 volts, and that a few would have much higher responses that exceeded the ±0.9-volt range.
In fact, only 6 of the 70 spectra had responses within ±0.6 volts and 37 had responses exceeding
±0.9 volts: almost the opposite of the desired result. Thus, a total of 53 constraints (the 16
initial spectra plus the 37 added) was required from the data file of 86 LAS spectra. On a
percentage basis, this number of constraints was
too large to allow expansion of the method to the entire data file. Some thought was given to
increasing  the  maximum-allowed  response,  but  the  potential  decrease  in  the  number  of 
constraints  would  have  been  offset  by  a decrease  in  the  reliability  of  the  method.  Procedural 
changes  were  also  considered,  but  none  appeared  likely  to  make  any  significant  improvement. 

These  results  indicated  that  the  linear-programming  technique  was  unsuitable  for 
classification  purposes.  The  technique  required  subjective  judgements  to  initiate  the  classification 
process,  and  thus  required  that  the  resulting  response  values  be  capable  of  clearly  defining  the 
significant  spectra  in  the  data  file.  The  actual  response  values  did  not  produce  decisive  results, 
and  reliable  classifications  could  not  be  obtained.  Nevertheless,  the  technique  provided  some 
essential  information  concerning  the  effectiveness  of  classification  methods. 

All  of  the  classification  techniques  rely  on  a basic  assumption  that  two  spectra  from 
a data  set  can  be  used  to  constrain  a third  spectrum  not  included  in  that  set.  This  assumption 
was  supported  by  field  tests  and  computer  simulations.  The  results  of  the  linear-programming 
attempt  at  classification  indicated  that  this  assumption  might  not  always  be  valid.  Since  most  of 
the spectra eliminated by this method had responses between 0.6 and 0.9 volts, there was at least
one solution in which these spectra would not be adequately constrained to ±0.3 volts by
the constraint file. There was no guarantee that other problem solutions did not exist and that
some future agent solution might not have a false alarm if one of these so-called insignificant
spectra were encountered. This latter problem could exist for any spectrum eliminated as a
constraint by any classification method. The experience to date indicates that it is unlikely to
occur, but any future work should examine this problem in more detail.

D. Factor Analysis.

The first three classification techniques had very limited success. None of them were
able  to  select  the  significant  spectra  from  a data  file  and  reduce  the  number  of  constraints  to  a 
workable  level.  When  applied  to  the  entire  data  set,  at  least  100  spectra  would  have  been 
required  as  constraints  by  each  of  the  methods.  In  addition,  each  of  the  methods  required 
subjective  judgements  to  select  some  of  the  constraints.  Any  new  technique  should  eliminate  the 
subjective  choosing  of  constraints,  isolate  the  significant  spectra  or  the  significant  features  within 
a given  spectrum,  and  provide  all  the  information  required  to  pick  suitable  constraints. 

1.  Theory. 

Factor  analysis  is  a procedure  to  find  a new  set  of  variables  (called  factors)  which 
describe  a set  of  data.6  In  this  case,  the  data  are  a collection  of  spectra  with  common 
characteristics. Each spectrum consists of a signal measured at $N$ different wavelengths and can
be treated as an $N$-component vector $S_j$, where the subscript designates a particular spectrum in
the collection.

$$S_j = (S_{1j}, S_{2j}, \ldots, S_{Nj})', \qquad j = 1, 2, \ldots, n$$

$S_{ij}$ is the signal of the $j$th spectrum measured at the $i$th wavelength. There are $n$ spectra in the
collection.

The first step in factor analysis is to convert the raw data to a standardized form in
order to simplify the mathematics. At each wavelength, the average signal $\bar{S}_i$ and the variance $\sigma_i^2$
are computed from

$$\bar{S}_i = \frac{1}{n} \sum_{j=1}^{n} S_{ij}, \qquad \sigma_i^2 = \frac{1}{n} \sum_{j=1}^{n} (S_{ij} - \bar{S}_i)^2$$

The raw data variables $S_{ij}$ are then transformed to standardized variables

$$z_{ij} = \frac{S_{ij} - \bar{S}_i}{\sigma_i}$$

Standardized vectors $Z_j$, each representing a particular spectrum, may be combined into a matrix
$Z$ containing all the spectra in the collection. Each column of $Z$ is one spectrum while each row
represents a specific wavelength. The size of $Z$ is $N \times n$.
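
The standardization step can be sketched in NumPy as follows. This is a minimal illustration, not the program used in the study: the function name `standardize` and the random toy data are assumptions, and the variance is taken with a divisor of $n$ (the number of spectra), the convention under which the correlation matrix has 1's on its diagonal.

```python
import numpy as np

def standardize(S):
    """Standardize an N x n data matrix (rows: wavelengths, columns: spectra).

    Returns the standardized matrix Z together with the per-wavelength
    mean and standard deviation. Assumes no wavelength has zero variance.
    """
    S = np.asarray(S, dtype=float)
    mean = S.mean(axis=1, keepdims=True)   # average signal at each wavelength
    sigma = S.std(axis=1, keepdims=True)   # population std. dev. (divisor n)
    Z = (S - mean) / sigma
    return Z, mean, sigma

# Hypothetical toy data: N = 4 wavelengths, n = 6 spectra.
rng = np.random.default_rng(0)
S = rng.normal(size=(4, 6))
Z, mean, sigma = standardize(S)
```

After this step each row of `Z` has zero mean and unit variance across the collection, which is what allows the correlation matrix to be formed from a single matrix product.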

The fundamental assumption of factor analysis is that each standardized variable $z_i$
can be written as a linear combination of new variables or factors. The factors are considered to
be hypothetical constructs whose general nature is unknown but which can be calculated for a
specific case. The number of factors needed to reproduce the original data exactly is equal to the
number of original data variables. In practice, it is usually possible to reconstruct the original
data to an acceptable degree of accuracy with a smaller number of factors. The basic model of
factor analysis can be written as

$$z_i = \sum_{k=1}^{m} a_{ik} f_k$$

Each $z_i$ is the standardized signal at the $i$th wavelength for an arbitrary spectrum. A total of
$m$ factors $f_k$ are being used. The $a$'s are constants and $a_{ik}$ is called the loading of factor $k$ on
variable $i$. The loadings are correlations between the original variables and the factors, and are
found, as shown below, from the eigenvalues and eigenvectors of the data correlation matrix $R$.

Once the matrix $Z$ of standardized data has been constructed, it is an easy matter to
calculate the correlation matrix $R$ since

$$R = \frac{1}{n} Z Z' \qquad (6)$$

The prime denotes the transposition of the matrix. The matrix $R$ is square and symmetric with
1's along the main diagonal. Each element is the statistical correlation between standardized
variables $z_i$ and $z_j$.

As mentioned above, the loadings are derived from the eigenvalues and eigenvectors
of $R$. Techniques for finding eigenvalues and eigenvectors can be found in reference 7. Let
$\lambda_1, \lambda_2, \ldots, \lambda_N$ be the eigenvalues such that $\lambda_1 > \lambda_2 > \cdots > \lambda_N$ and $V_1, V_2, \ldots, V_N$ the corresponding
eigenvectors. The loadings may be found from

$$a_{ik} = \sqrt{\lambda_k}\, v_{ik}, \qquad i = 1, \ldots, N, \quad k = 1, \ldots, N$$

where $v_{ik}$ is the $i$th component of $V_k$.

The number $m$ of factors to be used for a given set of data can be determined from
the eigenvalues. The ratio

$$\frac{\sum_{j=1}^{m} \lambda_j}{\sum_{j=1}^{N} \lambda_j} \qquad (7)$$

is the proportion of the variation in the data reproduced by the $m$ factors.

The $a$'s may be combined into an $N \times m$ matrix $A$. The basic model of factor
analysis may now be written as

$$Z = A F \qquad (8)$$

where $F$ is the matrix of factor scores.

Substituting equation (8) into equation (6) and using the fact that the transposition of the
product of two matrices is equal to the product of their transpositions in reverse order, one
obtains

$$\hat{R} = \frac{1}{n} (AF)(AF)' = A \left( \frac{1}{n} F F' \right) A'$$

The quantity in parenthesis is the dot product of a set of orthogonal vectors with itself and is
therefore equal to the identity matrix. The result is called the fundamental equation of factor
analysis (reference 8).

$$\hat{R} = A A' \qquad (9)$$

The circumflex signifies that this $\hat{R}$ is the matrix of reproduced correlations. In general, it will be
identical with the original correlation matrix $R$ only if $N$ factors are used. Comparison of $\hat{R}$ and
$R$ shows the extent to which the number of factors used in the calculation accurately reproduces
the original data.
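
The chain from standardized data to loadings and reproduced correlations can be sketched as below. This is an illustrative sketch under stated assumptions, not the study's program: the helper names (`unrotated_loadings`, `reproduction_error`) and the random toy data are hypothetical, `numpy.linalg.eigh` stands in for whatever eigenvalue routine was actually used, and the correlation matrix is normalized by the number of spectra so that its diagonal is 1.

```python
import numpy as np

def unrotated_loadings(Z):
    """Return R, the eigenvalues (descending) and the full N x N loading matrix."""
    n = Z.shape[1]                          # number of spectra
    R = Z @ Z.T / n                         # correlation matrix, N x N
    eigvals, eigvecs = np.linalg.eigh(R)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder so the largest comes first
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against round-off negatives
    eigvecs = eigvecs[:, order]
    A = eigvecs * np.sqrt(eigvals)          # a_ik = sqrt(lambda_k) * v_ik
    return R, eigvals, A

# Hypothetical toy data: N = 5 wavelengths, n = 40 spectra, standardized by row.
rng = np.random.default_rng(1)
S = rng.normal(size=(5, 40))
Z = (S - S.mean(axis=1, keepdims=True)) / S.std(axis=1, keepdims=True)

R, eigvals, A = unrotated_loadings(Z)
ratio = np.cumsum(eigvals) / eigvals.sum()  # expression (7) for m = 1, ..., N

def reproduction_error(m):
    """Largest absolute difference between R and the m-factor reproduction AA'."""
    A_m = A[:, :m]
    return np.abs(R - A_m @ A_m.T).max()
```

Calling `reproduction_error(m)` for increasing `m` shows how closely the first `m` factors reproduce the correlations; with all `N` factors the error is at round-off level.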

Another useful matrix equation can be obtained by considering the product of $A'$
and $A$. Using several theorems from linear algebra, it can be shown that

$$\Lambda = A' A \qquad (10)$$

where $\Lambda$ is a diagonal matrix whose diagonal elements are the eigenvalues $\lambda_j$ (reference 7). The size of $\Lambda$ is
$m \times m$.


It  is  possible  to  generate  a more  useful  set  of  loadings  by  forming  linear 
combinations  of  the  original  loadings.  The  new  loadings  are  referred  to  as  rotated  loadings  while 
the  original  loadings  are  described  as  unrotated  loadings.  The  names  are  derived  from  the 
procedure  for  calculating  the  new  loadings  which  consists  of  a series  of  rotations  in  an 
$m$-dimensional vector space. The rationale for calculating a new set of loadings stems from the
manner  in  which  the  different  sets  of  loadings  correlate  with  the  original  variables.  The 
unrotated  loadings  generally  have  a high  correlation  with  many  of  the  original  variables  while  the 
rotated  loadings  correlate  highly  with  just  a few  variables.  It  should  be  easier  to  gain  insight  into 
the  relationship  between  the  original  variables  which  have  a high  correlation  with  a rotated 
loading  because  of  the  smaller  number  of  variables. 

Although many procedures exist for calculating the rotated loadings, the Kaiser
Varimax method as described in references 6 and 8 was used. The matrix of rotated loadings $B$ is
computed from $A$ by an orthogonal transformation matrix $T$.

$$B = A T \qquad (11)$$

In terms of the rotated loadings, the basic model of factor analysis may be written
as

$$Z = B F_R \qquad (12)$$

The subscript on $F_R$ identifies the rotated factors which are different from the unrotated
factors $F$.
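
One common way to compute an orthogonal varimax rotation is the iterative singular-value-decomposition scheme sketched below. It is offered only as an illustration of equation (11): it is not necessarily the procedure described in references 6 and 8, and it omits the row normalization that Kaiser's method usually applies. The function names, the tolerance, and the toy loading matrix are assumptions.

```python
import numpy as np

def varimax(A, max_iter=200, tol=1e-10):
    """Rotate an N x m loading matrix A; return (B, T) with B = A @ T."""
    N, m = A.shape
    T = np.eye(m)
    previous = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Gradient of the raw varimax criterion with respect to the rotation.
        G = A.T @ (B ** 3 - B * (B ** 2).sum(axis=0) / N)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # orthogonal factor of G
        if s.sum() < previous * (1.0 + tol):
            break                       # no further improvement
        previous = s.sum()
    return A @ T, T

def varimax_criterion(B):
    """Sum over factors of the variance of the squared loadings."""
    return ((B ** 2).var(axis=0)).sum()

# Hypothetical loadings: a simple two-factor structure mixed by a 30 degree turn.
simple = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                   [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = simple @ rot
B, T = varimax(A)
```

Because `T` is orthogonal, `B @ B.T` equals `A @ A.T`, so the rotation changes how the loadings are distributed among the factors without changing the reproduced correlations.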


Even though the general nature of the factors is unknown, specific values of each
factor can be calculated for a given $Z_j$. These numbers are referred to as factor scores. A set
(vector) of factor scores $F_{Rj}$ is computed for each standardized signal $Z_j$ and is used to classify
each signal. The size of $F_{Rj}$ is $m \times 1$. Premultiplying (12) by $B'$, the factor scores can be
found from

$$F_{Rj} = (B'B)^{-1} B' Z_j \qquad (13)$$

Unfortunately, it is usually not practical to use equation (13) to calculate the factor
scores because it is necessary to use the full $N \times N$ matrix for $B$ when computing the inverse. This
means that all $N$ loadings must be found and rotated. It would be preferable to perform the
calculation by use of only $m$ loadings. This is accomplished with the following equation, which is
derived in appendix A.

$$F_{Rj} = B' A \Lambda^{-2} A' Z_j \qquad (14)$$
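
A numerical check of equation (14) can be sketched as follows. The data, the sizes, and the use of a random orthogonal matrix as a stand-in for the varimax matrix $T$ are all assumptions made for illustration; the point is only that the scores obtained from the $N \times m$ matrices $A$ and $B$ agree with the intermediate form $T' \Lambda^{-1} A' Z_j$ derived in appendix A.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n, m = 6, 50, 3                       # hypothetical sizes
S = rng.normal(size=(N, n))
Z = (S - S.mean(axis=1, keepdims=True)) / S.std(axis=1, keepdims=True)

# Unrotated loadings for the m largest eigenvalues of R.
R = Z @ Z.T / n
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:m]
lam = eigvals[order]
A = eigvecs[:, order] * np.sqrt(lam)     # N x m

# Stand-in for the varimax matrix T: any m x m orthogonal matrix.
T, _ = np.linalg.qr(rng.normal(size=(m, m)))
B = A @ T                                # equation (11)

# Equation (14): only the N x m matrices A and B are needed.
W = B.T @ A @ np.diag(lam ** -2.0) @ A.T # m x N scoring matrix
F_R = W @ Z                              # one column of scores per spectrum

# Same scores by rotating the unrotated scores: T' Lambda^-1 A' Z.
F_check = T.T @ np.diag(1.0 / lam) @ A.T @ Z
```

The scoring matrix `W` depends only on the loadings, so it can be computed once and then applied to any standardized spectrum.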


2.  Results. 

A computer  program  was  written  to  calculate  all  the  required  parameters  needed  to 
produce  a set  of  rotated  loadings  and  corresponding  factor  scores  for  data  sets  of  up  to  200  spectra. 
Computer  subroutines  from  the  International  Mathematical  and  Statistical  Libraries,  Inc.  were  used 
to calculate $A$, $B$, and $T$ (reference 9). The program's output included listings of variances, eigenvalue ratios and
factor  scores,  plus  plots  of  the  mean  spectrum,  standard  deviation,  loadings,  and  reconstructed 
spectra.  There  was  no  preconceived  idea  as  to  what  the  results  would  be  or  how  they  would  be  used 
for  classification.  It  was  hoped  that  the  data  would  be  reduced  to  a small  group  of  factors  and 
loadings  that  represented  the  most  significant  spectral  properties. 

As before, the technique was first applied to the set of LAS spectra. The program
divided the spectra into 63 intervals of 0.05 µm from 8.68 to 11.83 µm, the same as in the
optimization program. This meant that a maximum of 63 loadings and factor scores would be
required to reproduce a spectrum. The first step was a quick calculation of the eigenvalue ratios for
the first 20 factors (expression 7). The cumulative percentage value in table 1 is the eigenvalue ratio
and represents the proportion of variation reproduced by the number of factors; thus, the first 20
out of 63 factors were responsible for 99.98% of the variation. Clearly, as the number of factors
increases, the importance of the individual factor to the reconstruction decreases. Ten factors were
chosen for the initial classification attempt, and rotated loadings and factor scores were calculated
for the 86 LAS spectra.


Table 1. Eigenvalues and Eigenvalue Ratios for the First 20 Factors from the LAS Data File

Factor                 Cumulative      Factor                 Cumulative
number   Eigenvalue    percentage      number   Eigenvalue    percentage

   1      32.6060        51.76           11       .0128         99.92
   2      24.2045        90.18           12       .0099         99.93
   3       3.2292        95.30           13       .0065         99.94
   4       2.1455        98.71           14       .0049         99.95
   5        .3619        99.28           15       .0046         99.96
   6        .1687        99.55           16       .0038         99.96
   7        .0971        99.70           17       .0028         99.97
   8        .0605        99.80           18       .0026         99.97
   9        .0382        99.86           19       .0020         99.98
  10        .0234        99.90           20       .0018         99.98
At this point, the resulting data were examined in an attempt to apply them to the
classification study. As mentioned before, the loadings are correlations between the original
variables and the factors. The plots of the rotated loadings (figure 16) showed decreasing correlation
values with an increase in factor number, again verifying the decreasing significance of higher order
factors. The factor scores themselves consisted of positive and negative numbers ranging from -3.8
to +5.5 for the 86 LAS spectra. An assumption was made that the maximum and minimum scores
for each factor might create a boundary condition for the data set. Those spectra having one or
more factors on the boundary might be the most significant spectra in the data file, and as such,
would be constraint candidates. Such a selection would be entirely based on numerical data and
would not require any subjective judgements.

Figure 16. First 10 Rotated Loadings from LAS Data File

There were 23 spectra (table 2) that had maximum or minimum scores. (Spectra are
illustrated in appendix B.) In some cases, several spectra had the same score for the same factor; in
others, there were spectra having more than one factor on the boundary.
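
The boundary selection rule can be sketched as follows, assuming the factor scores are held in an m x n array with one column per spectrum. The function name and the toy scores are hypothetical. Ties are kept, so several spectra can share a boundary value for the same factor, and one spectrum can lie on the boundary of more than one factor.

```python
import numpy as np

def boundary_spectra(F):
    """Select spectra holding the maximum or minimum score of any factor.

    F is an m x n array (rows: factors, columns: spectra). Returns the
    selected column indices and the per-factor lower and upper bounds.
    """
    F = np.asarray(F, dtype=float)
    upper = F.max(axis=1, keepdims=True)
    lower = F.min(axis=1, keepdims=True)
    on_boundary = (F == upper) | (F == lower)     # ties are all kept
    selected = np.flatnonzero(on_boundary.any(axis=0))
    return selected, lower.ravel(), upper.ravel()

# Hypothetical scores: 2 factors, 5 spectra.
F = np.array([[0.1, -1.7, 2.9, 0.3, 2.9],
              [0.5,  0.2, -0.4, 1.0, 0.0]])
selected, lower, upper = boundary_spectra(F)
```

In this toy case spectra 1, 2, 3 and 4 (zero-based) are selected: spectra 2 and 4 tie for the maximum of the first factor, and spectrum 2 also holds the minimum of the second.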


Table 2. Factor Scores for LAS Spectra Meeting the Boundary Conditions

Scores are listed in order for factors 1 through 10. Rows with fewer than 10 entries are
incomplete in the source, and the factor positions of their surviving entries are uncertain.

Spectrum
number     Factor scores

   1       -1.7, 2.9, -1.3, .25, .01, -.70, 1.00, .81, 1.6, -2.6
   2       -.76, -.08, .86, -.29, .55, -.19, -.89, -.45, -2.6
   3       -.18, 2.4, -1.6, .58, .38, 1.2, .75, 2.2, 1.3, -1.4
   4       1.1, -.97, .15, -.41, .62, -.68, 2.6, .44, .10
   5       -2.1, .07, -.11, -2.0, -.21, .94, 3.1, -1.5, -.77, -.16
   6       -.76, -.52, -.01, 1.2, -.91, -.48, -.64, -2.6, -.07, -.62
   7       1.5, .81, -1.3, -.68, .73, .64, 1.9, -1.6, -2.5, -.22
   8       1.4, .62, -1.1, -.47, .59, .71, 1.8, -1.5, -2.5, -.18
   9       .98, -1.5, -.06, -.31, -.28, -.46, .26, .96, .13
  10       .98, -1.5, -.01, -.56, -.26, -.25, .41, 1.3, .35
  11       .99, -1.5, -.01, -.88, -.42, .17, .90, 1.5, .64, -.46
  12       -1.5, -.05, -1.0, -.60, -.11, 1.8, 1.1, .89
  13       .97, -1.5, .02, -1.1, -.74, -.18, 2.2, 1.1, .61, .76
  14       -.43, 1.4, 2.5, 2.1, -1.3, 1.2, 1.7, -.72, 2.2
  15       .47, 1.1, 1.2, .20, 2.6, .92, -.25, 2.5, -1.5
  16       -3.0, .18
  17       1.6, -1.8, .38
  18       -.87, 1.1, .90, .09, .66, .53
  19-21    .82, .81, -.85, -.49, .43, -1.1, .02, .69, 2.1, .30, -.55, .69, 1.5, .63, -1.4, 2.0, 1.2, 2.2
           (row breaks lost)
  22       .42, .68, 2.7, .77, .05
  23       .72, 2.3, .34, -2.5, -.22, -3.8, -1.3, .00, -1.8, .39

These 23 spectra were then put into a file and used as constraints for an agent
target in the optimization program. The resulting set of channels and coefficients was then used
in the simulation program to obtain the responses of all the spectra. Many of the LAS spectra
had responses within ±0.3 volts. Only one LAS had a response exceeding ±0.6 volts. These results
indicated that the technique was promising. A slight increase in the number of factors might
produce a reasonably sized constraint set that would constrain the entire file to within the
desired ±0.3 volts. In addition, the technique might also be applied to the entire data file with
little difficulty.

A new file of 162 spectra was established that consisted of all LAS, terrain, and
interference spectra except those in which part of the spectrum was positive. As before, the
eigenvalues for the first 20 factors were obtained as a guide in selecting the number of factors to
be used for classification. Fifteen factors, representing 99.91% of the variation, were chosen. The
15 loadings and corresponding factor scores for each spectrum were calculated. The factor scores
ranged from -8.3 to +7.0 for the 162 spectra in the file. The total number of spectra having one
or more boundary factor scores was 27. If these 27 proved to be capable of constraining the
entire group of 162, then factor analysis would be the most promising classification technique.

Using these 27 spectra as constraints, a set of channels and coefficients was
obtained. This solution was used in the simulation program to obtain response values for every
spectrum in the data file. Each of the 162 spectra had response values within ±0.6 volts, a
significant result. It should be noted that in both groups of data, an eigenvalue ratio of
approximately 99.91% resulted in response values within ±0.6 volts. These results indicated that
the technique should produce reasonable constraints if the number of factors was increased to
obtain a higher eigenvalue ratio.

The  limiting  eigenvalue  ratio  was  then  increased  to  99.98%,  and  loadings  and  factor 
scores  were  obtained  for  both  the  LAS  file  and  the  larger  mixed  file.  This  resulted  in  19  factors 
and  32  constraints  for  the  LAS  file,  and  27  factors  and  37  constraints  for  the  mixed  file.  Channel 
and  coefficient  sets  for  two  different  agents  were  obtained  for  both  constraint  sets,  and  the 
solutions  were  entered  into  the  simulation  program  to  obtain  response  values.  For  the  LAS 
constraints,  the  remaining  54  LAS  spectra  were  constrained  to  within  ±0.4  volts  with  the 
exception  of  five  spectra  from  one  solution  and  two  spectra  from  the  other.  For  the  mixed 
constraints, the remaining 125 spectra were constrained to within ±0.4 volts with the exception
of three spectra from one solution and two spectra from the other. The highest response for any
of the above exceptions was 0.605 volts. These results indicated that the factor analysis
technique, coupled with the established procedure for selecting constraints, was the most effective
and  reliable  of  the  four  methods  examined.  Given  a large  data  file,  the  technique  will  select  a 
reasonably  sized  set  of  spectra  that  will  constrain  the  entire  data  file  adequately  to  within  some 
specified  response  limit.  The  technique  is  specific  and  requires  no  subjective  selection  of  any 
spectra. 


One  final  question  was  the  ability  of  factor  analysis  to  classify  new  spectral  data. 
Based  on  the  previous  results,  it  was  assumed  that  any  new  spectrum  having  factor  scores  less 
than  the  established  boundary  scores  would  be  adequately  represented  by  existing  constraints. 
This  assumption  was  tested  by  examination  of  five  dust  and  five  LAS  spectra  not  included  in  any 
of  the  previous  trials.  The  27  factor  scores  from  the  mixed-constraint  set  were  calculated  for  the 
new spectra and compared to the boundary scores. Only one spectrum, the LAS in figure 4, had
factor scores exceeding the boundary limits. All 10 spectra were placed in the simulation
program and responses obtained for the two agent solutions for the larger data set. None of the
new spectra had responses exceeding the ±0.4 volt range. Basically, this supported the assumption
that factor scores within the boundary implied that the spectrum was not significant; however,
since the spectrum having scores outside the limit did not exceed the response range, the
assumption is still uncertain. A complete answer to this question requires many more examples
in order to obtain a statistical basis for the final answer.
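
The test applied to a new spectrum can be sketched as a comparison of its factor scores against the stored boundary scores. The function name and all numerical values below are hypothetical; in practice the new scores would be computed from the same loadings, mean spectrum, and standard deviations as the original data file.

```python
import numpy as np

def outside_boundary(scores, lower, upper):
    """Return the indices of factors whose score falls outside [lower, upper]."""
    scores, lower, upper = (np.asarray(x, dtype=float)
                            for x in (scores, lower, upper))
    return np.flatnonzero((scores < lower) | (scores > upper))

# Hypothetical boundary scores for three factors and two new spectra.
lower = np.array([-3.0, -2.6, -1.8])
upper = np.array([ 2.9,  3.1,  2.7])
inside = [0.4, -1.2, 2.0]      # every score within the boundary
outside = [0.4, -2.9, 2.0]     # second factor below its lower bound
```

An empty result means the spectrum lies within the boundary on every factor; a non-empty result flags it as a candidate for addition to the constraint file.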

The  overall  conclusion  of  the  factor  analysis  study  is  that  the  technique  can  select  a 
small  group  of  representative  constraints  from  a large  data  file.  The  method  has  the  potential  for 
automatic implementation and could prove to be capable of classifying new spectra in real time.

IV. CONCLUSIONS.

Of  the  four  methods  examined,  the  factor  analysis  technique  was  the  most  useful 
and  came  the  closest  to  meeting  all  of  the  original  objectives.  The  correlation  technique  could  be 
used  to  determine  whether  one  spectrum  was  nearly  identical  to  another  spectrum,  but  could  not 
make  any  significant  reduction  in  the  constraint  file.  The  inspection  technique  was  useful,  but 
could  not  be  automatically  implemented  or  make  a significant  reduction  in  the  number  of 
constraints.  The  linear-programming  technique  required  too  many  subjective  judgements  and  was 
too  complex  to  select  constraints  automatically.  Its  most  useful  result  was  to  illustrate  the 
possible  limitations  in  any  classification  system. 

The  37  spectra  chosen  as  constraints  from  the  factor  analysis  technique  come  close 
to  meeting  the  requirements  for  a set  of  independent  spectra  and  are  shown  in  appendix  C. 
APPENDIX A


DERIVATION OF $F_{Rj}$ EQUATION

Factor scores can be calculated from the unrotated and rotated loadings without
computing all $N$ loadings.*

From (8)

$$Z_j = A F_j$$

and

$$F_j = (A'A)^{-1} A' Z_j$$

Using (10)

$$F_j = \Lambda^{-1} A' Z_j \qquad \text{(A-1)}$$

which is the unrotated factor score.

The rotated factor scores are a little more difficult. From (12)

$$Z_j = B F_{Rj}$$

Therefore,

$$A F_j = B F_{Rj}$$

and using (11)

$$A F_j = A T F_{Rj}$$

or

$$F_{Rj} = T' F_j \qquad \text{(A-2)}$$

since $T$ is an orthogonal matrix.

Combining A-1 and A-2,

$$F_{Rj} = T' \Lambda^{-1} A' Z_j \qquad \text{(A-3)}$$

Using (11) again,

$$B = A T$$
$$A'B = A'A T$$
$$(A'A)^{-1} A'B = T$$
$$\Lambda^{-1} A'B = T$$

Therefore,

$$T' = B' A \Lambda^{-1} \qquad \text{(A-4)}$$

Substituting A-4 into A-3, one obtains (14)

$$F_{Rj} = B' A \Lambda^{-2} A' Z_j$$

The reconstructed spectrum $\hat{Z}_j$ can be found from

$$\hat{Z}_j = B B' A \Lambda^{-2} A' Z_j \qquad \text{(A-5)}$$

* See Literature Cited at the end of this appendix.

LITERATURE CITED

Kaiser, H. Formulas for Component Scores. Psychometrika 27(1), pp 83-87, 1962.
Figure C-3. LAB Simulated Dust Constraints

Figure C-5. LAB Simulated Contaminant Constraints